Abstract

Maximum-likelihood decoding is one of the central algorithmic problems in coding
theory. It has been known for over 25 years that maximum-likelihood decoding
of general linear codes is NP-hard. Nevertheless, it was so far unknown whether
maximum-likelihood decoding remains hard for any specific family of codes with
nontrivial algebraic structure. In this paper, we prove that maximum-likelihood 
decoding is NP-hard for the family of Reed-Solomon codes. We moreover show 
that maximum-likelihood decoding of Reed-Solomon codes remains hard even 
with unlimited preprocessing, thereby strengthening a result of Bruck and Naor. 
1. Introduction



Maximum-likelihood decoding is one of the central (perhaps, the central) algorithmic
problems in coding theory. Berlekamp, McEliece, and van Tilborg [4] showed that this problem
is NP-hard for the general class of linear codes. More precisely, the corresponding decision
problem can be formally stated as follows.

Problem: Maximum-Likelihood Decoding of Linear Codes (MLD-Linear) 
Instance: An $m \times n$ matrix $H$ over $\mathbb{F}_q$, a target vector $s \in \mathbb{F}_q^m$, and an integer $w > 0$.
Question: Is there a vector $v \in \mathbb{F}_q^n$ of weight $\le w$ such that $Hv^t = s^t$?
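
To make the decision problem concrete, the following is a minimal brute-force sketch for the binary case $q = 2$. The function name `mld_linear_f2` and the Hamming-code example are illustrative assumptions, not part of the original text; the search is exponential in $n$ and is only meant to pin down what a YES instance looks like.

```python
from itertools import combinations

def mld_linear_f2(H, s, w):
    """Return a vector v of weight <= w with H v^t = s^t over F_2, or None."""
    m, n = len(H), len(H[0])
    for weight in range(w + 1):
        for support in combinations(range(n), weight):
            # Over F_2, H v^t is the sum (mod 2) of the columns in the support of v.
            syndrome = [sum(H[i][j] for j in support) % 2 for i in range(m)]
            if syndrome == list(s):
                v = [0] * n
                for j in support:
                    v[j] = 1
                return v
    return None

# Hypothetical example: parity-check matrix of the [7,4] Hamming code.
H = [[0, 0, 0, 1, 1, 1, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]
print(mld_linear_f2(H, [1, 0, 1], 1))
```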

Berlekamp, McEliece, and van Tilborg [4] proved* in 1978 that this problem is NP-complete
using a reduction from Three-Dimensional Matching, a well-known NP-complete
problem [9, p. 50]. Since 1978, the complexity of maximum-likelihood decoding of general
linear codes has been extensively studied. Bruck and Naor [5] and Lobstein [16] showed
in 1990 that the problem remains hard even if the code is known in advance, and can
be preprocessed for as long as desired in order to devise a decoding algorithm. Arora,
Babai, Stern, and Sweedyk [1] proved that MLD-Linear is NP-hard to approximate within
any constant factor. Downey, Fellows, Vardy, and Whittle [7] proved that MLD-Linear
remains hard even if the parameter $w$ is a constant: it is not fixed-parameter tractable unless
FPT = W[1]. Recently, the complexity of approximating MLD-Linear with unlimited
preprocessing was studied by Feige and Micciancio [8] and by Regev [19]; this work
strengthens the results of both [5, 16] and [1] by showing that MLD-Linear is NP-hard to
approximate within a factor of $3 - \varepsilon$ for any $\varepsilon > 0$, even if unlimited preprocessing is allowed.

While the papers surveyed in the foregoing paragraph constitute a significant body of work,
all these papers deal with the general class of linear codes. This leads to a somewhat
incongruous situation. On one hand, there is no nontrivial useful family of codes for which
a polynomial-time maximum-likelihood decoding algorithm is known (such a result would,
in fact, be regarded as a breakthrough). On the other hand, the specific codes used in the
reductions of [1, 4, 5, 7, 8, 16, 19] are unnatural, and the problem of showing NP-hardness
of maximum-likelihood decoding for any useful class of codes with nontrivial algebraic
structure remains open, despite repeated calls for its resolution. For example, the survey of
algorithmic complexity in coding theory [22] says:

Although we have, by now, accumulated a considerable amount of results on the hardness 
of Maximum-Likelihood Decoding, the broad worst-case nature of these results is still 
somewhat unsatisfactory. [...] Thus it would be worthwhile to establish the hardness of
Maximum-Likelihood Decoding in the average sense, or for more narrow classes of codes.

*Note that Maximum-Likelihood Decoding of Linear Codes is NP-complete over all finite fields $\mathbb{F}_q$.
Berlekamp, McEliece, and van Tilborg [4] only proved this result for the special case $q = 2$. The easy
extension to arbitrary prime powers can be found, for instance, in [2, Proposition 2].

The first step along these lines was taken by Alexander Barg [2, Theorem 4], who showed
that maximum-likelihood decoding is NP-hard for the class of product (or concatenated)
codes, namely codes of type $C = A \otimes B$, where $A$ and $B$ are linear codes over $\mathbb{F}_q$. Barg
writes in [2] that this result is

... the first statement about the decoding complexity of a somewhat more restricted class 
of codes than just the "general linear codes." 

Observe, however, that the code $C = A \otimes B$ does not have any algebraic structure unless
$A$ and $B$ are further restricted in some manner. Furthermore, it is intuitively clear that the
decoding problem for this code cannot be much simpler than the decoding problem for its
factors $A$ and $B$, which are, again, general linear codes.

In this paper, we prove that maximum-likelihood decoding is NP-hard for the family of
Reed-Solomon codes. Let $q = 2^m$ and let $\mathbb{F}_q[X]$ denote the ring of univariate polynomials
over $\mathbb{F}_q$. Reed-Solomon codes are obtained by evaluating certain subspaces of $\mathbb{F}_q[X]$ in
a set of points $\mathcal{D} = \{x_1, x_2, \ldots, x_n\}$ which is a subset of $\mathbb{F}_q$. Specifically, a
Reed-Solomon code $C_q(\mathcal{D}, k)$ of length $n$ and dimension $k$ over $\mathbb{F}_q$ is defined as follows:

$$C_q(\mathcal{D}, k) \stackrel{\text{def}}{=} \bigl\{ (f(x_1), \ldots, f(x_n)) \;:\; x_1, \ldots, x_n \in \mathcal{D},\ f(X) \in \mathbb{F}_q[X],\ \deg f(X) < k \bigr\}$$

Thus a Reed-Solomon code is completely specified in terms of its evaluation set $\mathcal{D}$ and
its dimension $k$. As in [4], we assume that if a codeword of $C_q(\mathcal{D}, k)$ is transmitted and
the vector $y \in \mathbb{F}_q^n$ is received, the maximum-likelihood decoding task consists of computing
a codeword $c \in C_q(\mathcal{D}, k)$ that minimizes $d(c, y)$, where $d(\cdot, \cdot)$ denotes the Hamming
distance. The corresponding decision problem can be formally stated as follows.

Problem: Maximum-Likelihood Decoding of Reed-Solomon Codes

Instance: An integer $m > 0$, a set $\mathcal{D} = \{x_1, x_2, \ldots, x_n\}$ consisting of $n$ distinct
elements of $\mathbb{F}_{2^m}$, an integer $k > 0$, a target vector $y \in \mathbb{F}_{2^m}^n$, and an integer $w > 0$.

Question: Is there a codeword $c \in C_{2^m}(\mathcal{D}, k)$ such that $d(c, y) \le w$?

We will refer to this problem* as MLD-RS for short. Our main result herein is that MLD-RS 
is NP-complete. Note that the formulation of MLD-RS is restricted to Reed-Solomon codes 
over a field of characteristic 2. However, our proof easily extends to Reed-Solomon codes 
over arbitrary fields: we use fields of characteristic 2 for notational convenience only. The 
key idea in the proof is a re-interpretation of the result that was derived in [23, Lemma 1]
in order to establish NP-hardness of computing the minimum distance of a linear code. 

It is particularly interesting that the only nontrivial family of codes for which we can now 
prove that maximum-likelihood decoding is NP-hard is the family of Reed-Solomon codes. 
Decoding of Reed-Solomon codes is a well-studied problem with a long history. There are 
well-known polynomial-time algorithms that decode Reed-Solomon codes up to half their 
minimum distance EJ[TD1[TH]|, and also well beyond half the minimum distance lfT2*ll2"Tl . 



Nevertheless, all these algorithms fall in the general framework of bounded-distance
decoders [22]. Our result shows that assuming a bound on the number of correctable errors,
as these algorithms do, is necessary, since maximum-likelihood decoding is NP-hard.

*In the definition of MLD-RS, the field elements of $\mathbb{F}_{2^m}$ are assumed to be represented by $m$-bit vectors.
Therefore the input size of an instance of MLD-RS is polynomial in $n$ and $m$.

In terms of work with related results, Goldreich, Rubinfeld, and Sudan [11] considered
a problem similar to MLD-RS in the context of general polynomial reconstruction
problems. Thus it is shown in [11, Section 6.1] that given $n$ pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
of elements from a large field, determining if a degree $k$ polynomial passes through at least
$k + 2$ of them is NP-hard. However, this formulation does not include the essential
restriction that the evaluation points $x_1, x_2, \ldots, x_n$ are all distinct (in fact, the proof of [11]
crucially exploits the fact that $x_i = x_j$ for some $i \ne j$), and therefore does not yield any
hardness results for Reed-Solomon decoding. We show that a problem very similar to the
one considered in [11] remains NP-hard when the evaluation points $x_1, x_2, \ldots, x_n$ are
distinct. Thus our result can be viewed as resolving one of the main questions left open by [11].

The proof of our main result (Theorem 5) is presented in the next section. In Section 3,
we further strengthen this result by showing that maximum-likelihood decoding of
Reed-Solomon codes remains hard even if unlimited preprocessing is allowed, and only the
received vector $y$ is part of the input. This is a well-motivated scenario, since the code
(namely, the evaluation set $\mathcal{D}$ and the dimension $k$) is usually known in advance. Thus
one-time preprocessing, even if computationally expensive, would be attractive if it leads
to efficient decoding. We prove in Section 3 (assuming NP does not have polynomial-size
circuits) that for some Reed-Solomon codes no such preprocessing procedure can exist.
This strengthens the main result of Bruck and Naor [5] in the same way that Theorem 5
strengthens the main result of Berlekamp, McEliece, and van Tilborg [4]. We conclude
the paper in Section 4 with a brief discussion, pointing out several simple corollaries of
Theorem 5 and suggesting a number of interesting open problems related to our results.

2. MLD-RS is NP-complete 

As in Berlekamp, McEliece, and van Tilborg [4], we reduce from Three-Dimensional
Matching. Let $\mathcal{U} = \{1, 2, \ldots, t\}$ and let $T$ be a set of ordered triples over $\mathcal{U}$, that is
$T \subseteq \mathcal{U} \times \mathcal{U} \times \mathcal{U}$. A subset $S$ of $T$ is called a matching if $|S| = t$ and every two triples
in $S$ differ in all three positions. As shown by Karp in his seminal paper [13] back in 1972,
the following decision problem is NP-complete.

Problem: Three-Dimensional Matching 

Instance: A set of ordered triples $T \subseteq \{1, 2, \ldots, t\} \times \{1, 2, \ldots, t\} \times \{1, 2, \ldots, t\}$.
Question: Is there a matching in $T$, namely a subset $S \subseteq T$ consisting of exactly $t$
triples such that $d(s, s') = 3$ for all distinct $s, s' \in S$?
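
As an illustration of the definition only, here is a brute-force sketch of the decision problem. The helper name `find_matching` and the sample instances are hypothetical; the search is exponential in $t$ and is not meant as a practical algorithm.

```python
from itertools import combinations

def find_matching(t, triples):
    """Return t triples that pairwise differ in all three positions, or None."""
    for subset in combinations(sorted(set(triples)), t):
        # t triples over {1..t} differ pairwise in every position exactly when
        # each coordinate takes t distinct values across the subset.
        if all(len({s[pos] for s in subset}) == t for pos in range(3)):
            return list(subset)
    return None

print(find_matching(2, [(1, 1, 2), (2, 2, 1), (1, 2, 2)]))
print(find_matching(2, [(1, 1, 1), (1, 2, 2)]))
```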

We shall write an instance of Three-Dimensional Matching as $\{t, T\}$. We henceforth
assume w.l.o.g. that $|T| > t + 1$ (otherwise, the problem is trivially solvable in
polynomial time). The following deterministic procedure converts any such instance $\{t, T\}$
into an instance $\{m, \mathcal{D}, k, w, y\}$ of MLD-RS.

A. Computing the integer parameters: Set $m = 3t$, $k = |T| - (t+1)$,
and $w = t$. Let $n = |T|$.

B. Computing the evaluation set: Let $q = 2^m$. First, construct the finite
field $\mathbb{F}_q$, that is, generate a primitive irreducible (over $\mathbb{F}_2$) binary
polynomial of degree $m$ which defines addition and multiplication in $\mathbb{F}_q$. Let $\alpha$
denote a root of this polynomial. Then $\alpha$ is a primitive element of $\mathbb{F}_q$ and the
set $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$ is a basis for $\mathbb{F}_q$ over $\mathbb{F}_2$. Now, convert each triple
$(a, b, c) \in T$ into a nonzero element of $\mathbb{F}_q$ as follows:

$$(a, b, c) \;\mapsto\; x = \alpha^{a-1} + \alpha^{t+b-1} + \alpha^{2t+c-1} \tag{1}$$

This produces $n = |T|$ distinct nonzero elements $x_1, x_2, \ldots, x_n \in \mathbb{F}_q$. Set the
evaluation set $\mathcal{D}$ to $\{x_1, x_2, \ldots, x_n\}$.

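C. Computing the target vector: Compute $\gamma = 1 + \alpha + \cdots + \alpha^{m-1}$
in $\mathbb{F}_q$. Thus $\gamma$ is the element of $\mathbb{F}_q$ that corresponds to the binary $m$-tuple
$(1, 1, \ldots, 1)$ in the chosen basis. Now, for each $j = 1, 2, \ldots, w+1$, compute

$$z_j \stackrel{\text{def}}{=} \frac{\gamma - \sum_{i=1,\, i \ne j}^{w+1} x_i}{\prod_{i=1,\, i \ne j}^{w+1} (x_i - x_j)} \qquad \text{and} \qquad \varphi_j \stackrel{\text{def}}{=} \prod_{\beta \in \mathbb{F}_q \setminus \mathcal{D}} (x_j - \beta) \tag{2}$$

Note that $\varphi_1, \varphi_2, \ldots, \varphi_{w+1}$ are all nonzero by definition. Set the target vector
$y = (y_1, y_2, \ldots, y_n)$ to $(z_1/\varphi_1, z_2/\varphi_2, \ldots, z_{w+1}/\varphi_{w+1}, 0, 0, \ldots, 0)$. In
other words: $y_j = z_j/\varphi_j$ for $j = 1, 2, \ldots, w+1$, and $y_j = 0$ otherwise.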
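
The encoding in Step B and the element $\gamma$ of Step C can be sketched with plain integers, assuming each field element is stored as a bitmask whose bit $i$ is the coefficient of $\alpha^i$ (so that field addition is XOR). The helper names `triple_to_element` and `field_sum` are illustrative, not from the paper. The sketch checks the property the reduction relies on: the elements of a matching add up to $\gamma$, while triples that clash in some position do not.

```python
from functools import reduce

def triple_to_element(triple, t):
    """Map (a, b, c) to alpha^(a-1) + alpha^(t+b-1) + alpha^(2t+c-1) as a bitmask."""
    a, b, c = triple
    return (1 << (a - 1)) | (1 << (t + b - 1)) | (1 << (2 * t + c - 1))

def field_sum(elements):
    """Addition in F_{2^m} is coordinate-wise XOR in any basis over F_2."""
    return reduce(lambda u, v: u ^ v, elements, 0)

t = 2
gamma = (1 << (3 * t)) - 1           # the all-one m-tuple, m = 3t
matching = [(1, 1, 2), (2, 2, 1)]
clash = [(1, 1, 2), (1, 2, 1)]       # both triples use 1 in the first position
print(field_sum(triple_to_element(s, t) for s in matching) == gamma)
print(field_sum(triple_to_element(s, t) for s in clash) == gamma)
```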

We will refer to the foregoing computation as the 3-DM/MLD-RS conversion procedure.
It is not immediately clear that this procedure runs in polynomial time (note that the
conversion procedure has to run in time which is polynomial in the size of the
Three-Dimensional Matching instance $\{t, T\}$, and therefore in time that is polynomial in the
logarithm of the field size). This fact is, therefore, established in the following lemma.

Lemma 1. The 3-DM/MLD-RS conversion procedure runs in time and space that are
bounded by a polynomial in the size of the instance $\{t, T\}$.

Proof. Step A is trivial. The only thing that is not immediately obvious in Step B is
whether a primitive irreducible binary polynomial of degree $m = 3t$ can be generated in
deterministic polynomial time. However, Shoup [20] provides a deterministic algorithm for this
purpose, whose complexity is $O(m^5)$ operations in $\mathbb{F}_2$. Clearly, $\gamma$ and $z_1, z_2, \ldots, z_{w+1}$ in
Step C can be computed in polynomial time and space. However, it is not clear whether this
is also true for $\varphi_1, \varphi_2, \ldots, \varphi_{w+1}$. Indeed, a straightforward evaluation of the expression
for $\varphi_j$ in (2) takes $2^m - n$ additions and multiplications in $\mathbb{F}_q$. Thus we now show how to
compute $\varphi_j$ in polynomial time. Define the polynomials

$$M(X) \stackrel{\text{def}}{=} \prod_{\beta \in \mathbb{F}_q} (X - \beta) = X^q - X \tag{3}$$

$$D(X) \stackrel{\text{def}}{=} \prod_{\beta \in \mathcal{D}} (X - \beta) = \sum_{i=0}^{n} d_i X^i \tag{4}$$

and let $G(X)$ denote the rational function $M(X)/D(X)$. Then $\varphi_j = G(x_j)$ in view of (2).
It is easy to see from (3) and (4) that $x_j$ is a simple root of both $M(X)$ and $D(X)$. Hence

$$G(x_j) = \frac{M'(x_j)}{D'(x_j)}$$

where $M'(X)$ and $D'(X)$ are the first-order Hasse derivatives of $M(X)$ and $D(X)$,
respectively. Note that $M'(X) = 1$ in a field of characteristic 2. It follows that

$$\varphi_j = \frac{1}{\sum_{i=0}^{n'} d_{2i+1}\, x_j^{2i}} \qquad \text{for } j = 1, 2, \ldots, w+1 \tag{5}$$

where $n' = \lfloor (n-1)/2 \rfloor$ and the coefficients $d_0, d_1, \ldots, d_n$ are elementary symmetric
functions of $x_1, x_2, \ldots, x_n$. These coefficients can be computed from (4) in time $O(n^2)$. Given
$d_0, d_1, \ldots, d_n$, the computation in (5) clearly requires at most $O(wn)$ operations in $\mathbb{F}_q$. ∎
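
The shortcut in (5) can be checked numerically on a small field. The sketch below is an illustration under stated assumptions: it works in $\mathbb{F}_{16}$ built from $x^4 + x + 1$, uses an arbitrary evaluation set, and all function names are hypothetical. It compares the direct product over $\mathbb{F}_q \setminus \mathcal{D}$ from (2) with the reciprocal of the formal derivative $D'(x_j)$ from (5).

```python
M_BITS, MODULUS = 4, 0b10011          # F_16 = F_2[x] / (x^4 + x + 1)

def gf_mul(a, b):
    """Multiply two field elements stored as M_BITS-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:               # reduce as soon as the degree reaches m
            a ^= MODULUS
    return result

def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(q-2)."""
    result = 1
    for _ in range((1 << M_BITS) - 2):
        result = gf_mul(result, a)
    return result

def phi_direct(xj, D):
    """Product of (x_j - beta) over all beta outside D, as in (2)."""
    out = 1
    for beta in range(1 << M_BITS):
        if beta not in D:
            out = gf_mul(out, xj ^ beta)
    return out

def poly_from_roots(D):
    """Coefficients d_0..d_n (lowest degree first) of the product of (X - x), x in D."""
    d = [1]
    for x in D:
        shifted = [0] + d                           # X * d(X)
        scaled = [gf_mul(x, c) for c in d] + [0]    # x * d(X)
        d = [s ^ c for s, c in zip(shifted, scaled)]
    return d

def phi_fast(xj, d):
    """1 / D'(x_j); in characteristic 2 only odd-index coefficients survive."""
    acc, power, xj_squared = 0, 1, gf_mul(xj, xj)
    for i in range(1, len(d), 2):
        acc ^= gf_mul(d[i], power)                  # d_i * x_j^(i-1)
        power = gf_mul(power, xj_squared)
    return gf_inv(acc)

D = [1, 2, 3, 5, 7]                   # arbitrary distinct nonzero elements
d = poly_from_roots(D)
print(all(phi_fast(x, d) == phi_direct(x, D) for x in D))
```

The direct product costs about $q - n$ multiplications per point, while the derivative-based route touches only the $n + 1$ coefficients of $D(X)$, which is what keeps the conversion polynomial in $\log q$.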



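Let $H = [h_{i,j}]$ be the $(w+1) \times n$ matrix over $\mathbb{F}_q$ defined by $h_{i,j} = x_j^{i-1}$ for $j = 1, 2, \ldots, n$
and $i = 1, 2, \ldots, w+1$, where $x_1, x_2, \ldots, x_n$ are given by (1). Explicitly

$$H \stackrel{\text{def}}{=} \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ x_1^2 & x_2^2 & \cdots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^w & x_2^w & \cdots & x_n^w \end{bmatrix} \tag{6}$$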

The following lemma is a key step in our reduction from Three-Dimensional Matching
to MLD-RS. This lemma owes its general idea to [23, Lemma 1].

Lemma 2. The set $T$ has a matching if and only if there is a vector $v \in \mathbb{F}_q^n$ of weight $\le w$
such that $Hv^t = (0, 0, \ldots, 0, 1, \gamma)^t$.

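Proof. Following Berlekamp, McEliece, and van Tilborg [4], we first construct the $m \times n$
(or $3t \times |T|$) binary matrix $V$ having the binary representations of $x_1, x_2, \ldots, x_n$ as its
columns. As noted in [4], $T$ has a matching if and only if there is a set of $w = t$ columns
of $V$ that add to the all-one vector. The latter condition can be equivalently stated over $\mathbb{F}_q$
as follows: there is a subset $\{x_{i_1}, x_{i_2}, \ldots, x_{i_w}\}$ of $\mathcal{D}$ such that $x_{i_1} + x_{i_2} + \cdots + x_{i_w} = \gamma$.
Suppose that $T$ has a matching, so that such a set $\{x_{i_1}, x_{i_2}, \ldots, x_{i_w}\} \subseteq \mathcal{D}$ exists, and
consider the matrix

$$A \stackrel{\text{def}}{=} \begin{bmatrix} 1 & 1 & \cdots & 1 & 0 \\ x_{i_1} & x_{i_2} & \cdots & x_{i_w} & 0 \\ x_{i_1}^2 & x_{i_2}^2 & \cdots & x_{i_w}^2 & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ x_{i_1}^{w-2} & x_{i_2}^{w-2} & \cdots & x_{i_w}^{w-2} & 0 \\ x_{i_1}^{w-1} & x_{i_2}^{w-1} & \cdots & x_{i_w}^{w-1} & 1 \\ x_{i_1}^{w} & x_{i_2}^{w} & \cdots & x_{i_w}^{w} & \gamma \end{bmatrix} \tag{7}$$

It was shown in [23, Lemma 1] that

$$\det A = \bigl(\gamma - x_{i_1} - x_{i_2} - \cdots - x_{i_w}\bigr) \prod_{1 \le a < b \le w} (x_{i_b} - x_{i_a}) \tag{8}$$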

Hence $A$ is singular, so there exists a nonzero vector $u = (u_1, u_2, \ldots, u_{w+1}) \in \mathbb{F}_q^{w+1}$ such
that $Au^t = 0$. We claim that $u_{w+1} \ne 0$. To see this, replace the last column of $A$ by the
vector $(1, 1, \ldots, 1)^t$ to obtain the $(w+1) \times (w+1)$ matrix $A'$. If $u_{w+1} = 0$ then $A'u^t = 0$,
which is a contradiction since $\det A'$ is clearly nonzero (as $x_j \ne 1$ for all $j$ by (1), it is the
determinant of a Vandermonde matrix with distinct columns). We can now construct a
vector $v = (v_1, v_2, \ldots, v_n) \in \mathbb{F}_q^n$ of weight $\le w$ as follows

$$v_j \stackrel{\text{def}}{=} \begin{cases} -u_r / u_{w+1} & \text{if } x_j = x_{i_r} \text{ for some } r \in \{1, 2, \ldots, w\} \\ 0 & \text{otherwise} \end{cases}$$

It should be obvious from (6), (7) and the fact that $Au^t = 0$ that $Hv^t = (0, 0, \ldots, 0, 1, \gamma)^t$.
Conversely, assume that there is a vector $v \in \mathbb{F}_q^n$ such that $Hv^t = (0, 0, \ldots, 0, 1, \gamma)^t$ and
$\mathrm{wt}(v) \le w$. Write $\delta = \mathrm{wt}(v)$ and let $\{i_1, i_2, \ldots, i_\delta\}$ be the set of nonzero positions of $v$.
Let $\{i_{\delta+1}, \ldots, i_w\}$ be an arbitrary subset of $\{1, 2, \ldots, n\}$ of size $w - \delta$ that is disjoint
from $\{i_1, i_2, \ldots, i_\delta\}$. Then, as in (8), we have



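$$\det \begin{bmatrix} 1 & 1 & \cdots & 1 & 0 \\ x_{i_1} & x_{i_2} & \cdots & x_{i_w} & 0 \\ \vdots & \vdots & & \vdots & \vdots \\ x_{i_1}^{w-2} & x_{i_2}^{w-2} & \cdots & x_{i_w}^{w-2} & 0 \\ x_{i_1}^{w-1} & x_{i_2}^{w-1} & \cdots & x_{i_w}^{w-1} & 1 \\ x_{i_1}^{w} & x_{i_2}^{w} & \cdots & x_{i_w}^{w} & \gamma \end{bmatrix} = \bigl(\gamma - x_{i_1} - x_{i_2} - \cdots - x_{i_w}\bigr) \prod_{1 \le a < b \le w} (x_{i_b} - x_{i_a}) = 0 \tag{9}$$

since the fact that $Hv^t = (0, 0, \ldots, 0, 1, \gamma)^t$ implies that the matrix in (9) is singular. Since
$x_1, x_2, \ldots, x_n$ are all distinct, it follows from (9) that $x_{i_1} + x_{i_2} + \cdots + x_{i_w} = \gamma$. This, in
turn, implies that there is a matching in $T$, and we are done. ∎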
Recall that $k = |T| - (t+1) = n - (w+1)$ in our conversion procedure, and let $C$ be the
$(n, k)$ linear code over $\mathbb{F}_q$ having the matrix $H$ in (6) as its parity-check matrix. Further, let
$z = (z_1, z_2, \ldots, z_{w+1}, 0, 0, \ldots, 0) \in \mathbb{F}_q^n$ where $z_1, z_2, \ldots, z_{w+1}$ are defined by (2).



Corollary 3. The set $T$ has a matching if and only if the code $C \stackrel{\text{def}}{=} \{v \in \mathbb{F}_q^n : Hv^t = 0\}$
contains a codeword at Hamming distance $\le w$ from $z$.



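Proof. In view of Lemma 2, it would suffice to show that the syndrome of $z$ with respect
to $H$ is $(0, 0, \ldots, 0, 1, \gamma)^t$. Explicitly, we need to prove that

$$Hz^t = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_{w+1} \\ x_1^2 & x_2^2 & \cdots & x_{w+1}^2 \\ \vdots & \vdots & & \vdots \\ x_1^w & x_2^w & \cdots & x_{w+1}^w \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ z_3 \\ \vdots \\ z_{w+1} \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ \gamma \end{bmatrix} \tag{10}$$

The easiest way to see that the second equality in (10) holds is to regard this as a system of
linear equations in the indeterminates $z_1, z_2, \ldots, z_{w+1}$. Let $M$ denote the $(w+1) \times (w+1)$
matrix in (10). Since $M$ is clearly nonsingular, the system admits a unique solution, given
by $z_j = \det M_j / \det M$ for $j = 1, 2, \ldots, w+1$, where $M_j$ is the matrix obtained from $M$
by replacing the $j$-th column with $(0, 0, \ldots, 0, 1, \gamma)^t$. Now

$$\det M_j = \pm \Bigl(\gamma - \sum_{i=1,\, i \ne j}^{w+1} x_i\Bigr) \prod_{\substack{1 \le a < b \le w+1 \\ a, b \ne j}} (x_b - x_a) \qquad \text{for } j = 1, 2, \ldots, w+1$$

as in (8), while $\det M$ is the Vandermonde determinant $\prod_{1 \le a < b \le w+1} (x_b - x_a)$. From this,
the expression for $z_j$ in (2) easily follows. ∎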

The last observation we need is that the code $C$ defined in Corollary 3 is just a generalized
Reed-Solomon code. Specifically, let us extend the definition of $\varphi_1, \varphi_2, \ldots, \varphi_{w+1}$ in (2) to
all $j = 1, 2, \ldots, n$ and consider the mappings $\varphi : \mathbb{F}_q^n \to \mathbb{F}_q^n$ and $\varphi^{-1} : \mathbb{F}_q^n \to \mathbb{F}_q^n$ defined by

$$\varphi(u_1, u_2, \ldots, u_n) \stackrel{\text{def}}{=} (\varphi_1 u_1, \varphi_2 u_2, \ldots, \varphi_n u_n)$$

$$\varphi^{-1}(u_1, u_2, \ldots, u_n) \stackrel{\text{def}}{=} (u_1/\varphi_1, u_2/\varphi_2, \ldots, u_n/\varphi_n)$$

Note that $\varphi^{-1}$ is well-defined since $\varphi_1, \varphi_2, \ldots, \varphi_n$ are all nonzero. Also note that both $\varphi$
and $\varphi^{-1}$ are bijections and isometries with respect to the Hamming distance.



Lemma 4. $\varphi^{-1}(C) = C_q(\mathcal{D}, k)$.



Proof. We will prove the equivalent statement that $C$ is the image of $C_q(\mathcal{D}, k)$ under $\varphi$.
Let $G = [g_{i,j}]$ be the $k \times n$ matrix over $\mathbb{F}_q$ defined by $g_{i,j} = x_j^{i-1}$ for all $i = 1, 2, \ldots, k$ and
$j = 1, 2, \ldots, n$. It is well known (and obvious) that $G$ is a generator matrix for $C_q(\mathcal{D}, k)$.
Hence a generator matrix for the image of $C_q(\mathcal{D}, k)$ under $\varphi$ is given by $G' = [g'_{i,j}]$ where
$g'_{i,j} = \varphi_j x_j^{i-1}$. It would therefore suffice to prove that $G'$ is a generator matrix for the
code $C$, which is equivalent to the statement that $B = G'H^t$ is the $k \times (w+1)$ all-zero
matrix. By definition, a generic entry of $B = [b_{r,s}]$ is given by

$$b_{r,s} = \sum_{j=1}^{n} g'_{r,j}\, h_{s,j} = \sum_{j=1}^{n} \varphi_j\, x_j^{r-1} x_j^{s-1} = \sum_{j=1}^{n} \varphi_j\, x_j^{r+s-2} \tag{11}$$

for $r = 1, 2, \ldots, k$ and $s = 1, 2, \ldots, w+1$. Now, let $\mathbb{F}_q^*$ denote the set of nonzero elements
in $\mathbb{F}_q$, and define the polynomials

$$\Psi(X) \stackrel{\text{def}}{=} \prod_{\beta \in \mathbb{F}_q^* \setminus \mathcal{D}} (X - \beta) = \sum_{j=0}^{q-n-1} \psi_j X^j \tag{12}$$

$$\Phi(X) \stackrel{\text{def}}{=} X\,\Psi(X) = \prod_{\beta \in \mathbb{F}_q \setminus \mathcal{D}} (X - \beta) \tag{13}$$

By the definition of $\varphi_j$ in (2), we have $\varphi_j = \Phi(x_j) = x_j \Psi(x_j)$ for all $j = 1, 2, \ldots, n$.
Substituting this in (11), we obtain

$$b_{r,s} = \sum_{j=1}^{n} x_j \Psi(x_j)\, x_j^{r+s-2} = \sum_{\beta \in \mathbb{F}_q^*} \Psi(\beta)\, \beta^{r+s-1} = \sum_{\beta \in \mathbb{F}_q^*} \sum_{j=0}^{q-n-1} \psi_j\, \beta^{j+r+s-1} \tag{14}$$

where the second equality follows from the fact that $\Psi(\beta) = 0$ for all $\beta \in \mathbb{F}_q^* \setminus \mathcal{D}$. Finally,
interchanging the order of summation in (14), we obtain

$$b_{r,s} = \sum_{j=0}^{q-n-1} \psi_j \sum_{\beta \in \mathbb{F}_q^*} \beta^{j+r+s-1} = \sum_{j=0}^{q-n-1} \psi_j \sum_{i=0}^{q-2} (\alpha^i)^{j+r+s-1} = \sum_{j=0}^{q-n-1} \psi_j \sum_{i=0}^{q-2} \xi^i \tag{15}$$

where $\alpha$ is a primitive element of $\mathbb{F}_q$ and $\xi = \alpha^{j+r+s-1}$. The last summation in (15) is
a geometric series which evaluates to $(\xi^{q-1} - 1)/(\xi - 1) = 0$ provided $\xi \ne 1$. However,
since $2 \le r + s \le n$, it is easy to see that we will always have $1 \le j + r + s - 1 \le q - 2$.
Hence $\xi = \alpha^{j+r+s-1} \ne 1$. Thus $b_{r,s} = 0$ for all $r$ and $s$, and the lemma follows. ∎
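
Lemma 4 can be sanity-checked numerically on a toy code. The following sketch is illustrative only: it assumes $\mathbb{F}_{16}$ built from $x^4 + x + 1$, an arbitrary evaluation set of $n = 6$ nonzero points, and $w = 2$ (so $k = 3$); the function names are hypothetical. It scales each monomial codeword $(x_j^r)_j$, $r < k$, by the multipliers $\varphi_j$ and confirms that the result has zero syndrome with respect to the matrix $H$ of (6).

```python
M_BITS, MODULUS = 4, 0b10011          # F_16 = F_2[x] / (x^4 + x + 1)

def gf_mul(a, b):
    """Multiply two field elements stored as M_BITS-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:
            a ^= MODULUS
    return result

def gf_pow(a, e):
    """a^e by repeated multiplication (a^0 = 1)."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

def phi(xj, D):
    """phi_j: product of (x_j - beta) over all beta outside D."""
    out = 1
    for beta in range(1 << M_BITS):
        if beta not in D:
            out = gf_mul(out, xj ^ beta)
    return out

def syndrome(v, D, w):
    """H v^t for the (w+1) x n matrix H with entries x_j^(i-1)."""
    rows = []
    for s in range(w + 1):
        acc = 0
        for vj, xj in zip(v, D):
            acc ^= gf_mul(vj, gf_pow(xj, s))
        rows.append(acc)
    return rows

D = [1, 2, 3, 4, 5, 6]                # arbitrary distinct nonzero elements
w = 2
k = len(D) - (w + 1)
for r in range(k):
    image = [gf_mul(phi(x, D), gf_pow(x, r)) for x in D]   # phi applied to (x_j^r)
    print(r, syndrome(image, D, w))
```

Since the monomials $1, X, \ldots, X^{k-1}$ span the message space, a zero syndrome for each of them means the whole image of $C_q(\mathcal{D}, k)$ under $\varphi$ lies in $C$, and the two have the same dimension $k$.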

We are now ready to prove our main result in this paper. Indeed, all that remains to be done
to establish that MLD-RS is NP-complete is to combine Lemma 4 with Corollary 3.

Theorem 5. MLD-RS is NP-complete.

Proof. Note that $y = \varphi^{-1}(z)$ in the 3-DM/MLD-RS conversion procedure. Since $\varphi^{-1}$
is an isometry, it follows from Lemma 4 that there is a codeword $c \in C_q(\mathcal{D}, k)$ such that
$d(c, y) \le w$ if and only if $C$ contains a codeword at distance $\le w$ from $z$. By Corollary 3,
this happens iff the set $T$ has a matching. Hence the 3-DM/MLD-RS conversion procedure
is a polynomial transformation from Three-Dimensional Matching to MLD-RS. ∎
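
The whole reduction can be exercised end to end on toy instances. The sketch below is a hedged illustration, not the authors' implementation: it fixes $t = 2$ (so $m = 6$ and $q = 64$), assumes the field is built from the primitive polynomial $x^6 + x + 1$, computes $\varphi_j$ by the direct product in (2) rather than by (5), and decodes by exhaustive search over all $q^k$ message polynomials. All names are hypothetical.

```python
from itertools import product

T_PARAM = 2                 # toy value of t, so m = 3t = 6 and q = 64
M_BITS = 3 * T_PARAM
MODULUS = 0b1000011         # x^6 + x + 1, assumed here as the field polynomial

def gf_mul(a, b):
    """Multiply two elements of F_64 stored as 6-bit integers."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> M_BITS:     # reduce as soon as the degree reaches m
            a ^= MODULUS
    return result

def gf_inv(a):
    """Inverse of a nonzero element, computed as a^(q-2)."""
    result, exponent = 1, (1 << M_BITS) - 2
    while exponent:
        if exponent & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        exponent >>= 1
    return result

def convert(triples):
    """Steps A-C for t = T_PARAM: return (xs, k, w, y)."""
    t, n = T_PARAM, len(triples)
    k, w = n - (t + 1), t
    xs = [(1 << (a - 1)) | (1 << (t + b - 1)) | (1 << (2 * t + c - 1))
          for a, b, c in triples]
    gamma = (1 << M_BITS) - 1            # the all-one m-tuple
    y = [0] * n
    for j in range(w + 1):
        numerator, denominator = gamma, 1
        for i in range(w + 1):
            if i != j:
                numerator ^= xs[i]       # gamma minus the other x_i
                denominator = gf_mul(denominator, xs[i] ^ xs[j])
        z_j = gf_mul(numerator, gf_inv(denominator))
        phi_j = 1
        for beta in range(1 << M_BITS):  # direct product over F_q \ D
            if beta not in xs:
                phi_j = gf_mul(phi_j, xs[j] ^ beta)
        y[j] = gf_mul(z_j, gf_inv(phi_j))
    return xs, k, w, y

def distance_to_code(xs, k, y):
    """Minimum Hamming distance from y to C_q(D, k), trying all q^k polynomials."""
    best = len(xs)
    for coeffs in product(range(1 << M_BITS), repeat=k):
        dist = 0
        for x, target in zip(xs, y):
            value = 0
            for c in reversed(coeffs):   # Horner evaluation of f at x
                value = gf_mul(value, x) ^ c
            dist += value != target
        best = min(best, dist)
    return best

with_matching = [(1, 1, 2), (2, 2, 1), (1, 2, 2), (2, 1, 1)]
without_matching = [(1, 1, 1), (1, 2, 2), (1, 1, 2), (1, 2, 1)]
for triples in (with_matching, without_matching):
    xs, k, w, y = convert(triples)
    print(distance_to_code(xs, k, y) <= w)
```

Both instances satisfy $|T| > t + 1$; the first contains a matching and the second cannot, since all of its triples share the same first coordinate.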
3. Hardness of MLD-RS with preprocessing



As noted in [5, 16], the formulation of MLD-RS in the previous two sections might not be
the relevant one in practice. In coding practice, the code to be decoded is usually known
in advance; moreover, this code remains the same throughout numerous decoding attempts
wherein only the target vector $y$ changes. Thus it would make sense to assume that the
code is known a priori and can be preprocessed for a long time (essentially, unlimited
time) in order to devise an efficient decoding algorithm.

In the special case of Reed-Solomon codes, the general observation above reduces to the
following assumption: the Reed-Solomon code $C_q(\mathcal{D}, k)$, namely the set of evaluation
points $\mathcal{D} = \{x_1, x_2, \ldots, x_n\} \subseteq \mathbb{F}_q$ and the dimension $k$, is known in advance (and can
be preprocessed for as long as desired) and only the target vector $y \in \mathbb{F}_q^n$ is part of the input.
The corresponding decision problem can be formally phrased as follows.

Problem: MLD-RS with Preprocessing
Instance: A target vector $y \in \mathbb{F}_{2^m}^n$.

Question: Is there a codeword $c \in C_{2^m}(\mathcal{D}, k)$ such that $d(c, y) \le w$?

Observe that the above defines not one problem, but a whole set of problems: one for each
realization of $m$, $\mathcal{D}$, $k$, and $w$. We shall henceforth refer to a specific problem in this set as
MLD-RSwP$(m, \mathcal{D}, k, w)$. Asking whether a given problem MLD-RSwP$(m, \mathcal{D}, k, w)$ is
computationally hard makes no sense, since the size of the input $y \in \mathbb{F}_{2^m}^n$ to this problem
is at most $mn$ bits, while both $m$ and $n = |\mathcal{D}|$ are fixed. Thus asymptotic complexity
questions concerning a specific problem MLD-RSwP$(m, \mathcal{D}, k, w)$ are ill-posed.

So what can we do in order to show that maximum-likelihood decoding of Reed-Solomon
codes is computationally hard even with unlimited preprocessing? Here is a sketch of the
answer to this question. We can prove that:

There is an infinite sequence $\mathcal{P}_1, \mathcal{P}_2, \ldots$ of MLD-RSwP$(\cdot, \cdot, \cdot, \cdot)$
problems such that $m_1 \le m_2 \le \cdots$ and $|\mathcal{D}_1| < |\mathcal{D}_2| < \cdots$ with the
following property: under a certain assumption that is widely believed to be true,     (*)
there does not exist a constant $c > 0$ such that for all sufficiently large $i$,
each problem $\mathcal{P}_i$ can be solved in time and space at most $(m_i + |\mathcal{D}_i|)^c$.

The precise meaning of "$\mathcal{P}_i$ can be solved in time and space at most $(m_i + |\mathcal{D}_i|)^c$" in (*)
is that there exists a circuit $\mathsf{C}_i$ of size at most $(m_i + |\mathcal{D}_i|)^c$ that solves $\mathcal{P}_i$ for every possible
input $y \in \mathbb{F}_q^n$, where $q = 2^{m_i}$ and $n = |\mathcal{D}_i|$. Observe that we allow different circuits for
different problems, that is, the circuit $\mathsf{C}_i$ solving $\mathcal{P}_i$ = MLD-RSwP$(m_i, \mathcal{D}_i, k_i, w_i)$ may
depend on $m_i$, $\mathcal{D}_i$, $k_i$, and $w_i$. This corresponds to the "nonuniform" version of the class P
of polynomial-time decidable languages, where one can use different programs for inputs
of different sizes. The resulting complexity class is usually denoted as P/poly. Thus the
"assumption that is widely believed to be true" in (*) is that NP $\not\subseteq$ P/poly or, in words,
that not every language in NP has a polynomial-size circuit. It is indeed widely believed
that NP $\not\subseteq$ P/poly. In fact, it was shown by Karp and Lipton [14] that if NP $\subseteq$ P/poly
then the polynomial hierarchy collapses at the second level, namely $\bigcup_{i=1}^{\infty} \Sigma_i^p = \Sigma_2^p$. For
more details on this and more rigorous definitions of the terms used in this paragraph, we
refer the reader to Bruck-Naor [5] and to Papadimitriou [17].

How can one prove a statement such as (*)? The usual way (cf. [5, 8, 16]) to do this is as
follows. Start with an NP-complete problem $\Pi$. Then devise a deterministic procedure that
converts every instance $X$ of $\Pi$ into $m$, $\mathcal{D}$, $k$, $w$, and $y$ with the following properties:

P1. The parameters $m, \mathcal{D}, k, w$ depend only on size$(X)$, the size of the instance $X$,
and are constructed in time and space that are polynomial in size$(X)$.

P2. The target vector $y$ is also constructed in time and space that are polynomial
in size$(X)$, but may depend on the instance $X$ itself rather than only on its size.

P3. The target $\{y\}$ is a YES instance of the constructed MLD-RSwP$(m, \mathcal{D}, k, w)$
problem if and only if $X$ is a YES instance of $\Pi$.

For an explanation of this method and for a precise definition of size$(X)$, we again refer the
reader to [5, 17]. Here, we take $\Pi$ to be the Three-Dimensional Matching problem
introduced in the previous section. In this case, we can assume, as in [16], that the size of
an instance $\{t, T\}$ of Three-Dimensional Matching is simply $t$.

The following deterministic procedure combines the ideas of the previous section with
a suitably modified version of a reduction due to Lobstein [16]. Incidentally, Lobstein's
reduction [16] is by far the simplest way known (to us) to prove that MLD-Linear remains
hard with unlimited preprocessing (cf. [5, 8, 19]). Given an instance $\{t, T\}$ of
Three-Dimensional Matching, we proceed as follows.

A. Computing the integer parameters: Set $m = 3(t^3 + t)$, $w = t^3 + t$,
and $k = 3t^3 - (t+1)$. Let $n = 4t^3$.

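B. Computing the evaluation set: As in the previous section, let $q = 2^m$
and construct the finite field $\mathbb{F}_q$. Let $\alpha$ be an arbitrary primitive element of $\mathbb{F}_q$
and fix a basis $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$ for $\mathbb{F}_q$ over $\mathbb{F}_2$. Let $\mathcal{U} = \{1, 2, \ldots, t\}$,
and impose an arbitrary order on the $t^3$ triples in $\mathcal{U} \times \mathcal{U} \times \mathcal{U}$, say $(a_1, b_1, c_1)$,
$(a_2, b_2, c_2)$, $\ldots$, $(a_{t^3}, b_{t^3}, c_{t^3})$. Define*

$$x_j \stackrel{\text{def}}{=} \begin{cases} \alpha^{a_j-1} + \alpha^{t+b_j-1} + \alpha^{2t+c_j-1} + \alpha^{3t+j-1} & \text{for } 1 \le j \le t^3 \\ \alpha^{3t+(j-t^3)-1} + \alpha^{3t+j-1} + \alpha^{3t+(j+t^3)-1} & \text{for } t^3 < j \le 2t^3 \\ \alpha^{3t+(j-t^3)-1} + \alpha^{3t+j-1} & \text{for } 2t^3 < j \le 3t^3 \\ \alpha^{3t+(j-t^3)-1} & \text{for } 3t^3 < j \le 4t^3 \end{cases} \tag{16}$$

This produces $n = 4t^3$ distinct nonzero elements $x_1, x_2, \ldots, x_n \in \mathbb{F}_q$. Set the
evaluation set $\mathcal{D}$ to $\{x_1, x_2, \ldots, x_n\}$.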



*The evaluation points $x_1, x_2, \ldots, x_n$ may be better understood in terms of the matrix $W$, defined in (19),
whose columns are binary representations of $x_1, x_2, \ldots, x_n$ with respect to the basis $\{1, \alpha, \alpha^2, \ldots, \alpha^{m-1}\}$.
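C. Computing the target vector: Let $\chi_T = (\chi_1, \chi_2, \ldots, \chi_{t^3})$ be the
characteristic vector of $T \subseteq \mathcal{U} \times \mathcal{U} \times \mathcal{U}$. That is, $\chi_j = 1$ if the $j$-th triple $(a_j, b_j, c_j)$
of $\mathcal{U} \times \mathcal{U} \times \mathcal{U}$ belongs to $T$, and $\chi_j = 0$ otherwise. Compute

$$\gamma \stackrel{\text{def}}{=} \sum_{j=1}^{3t} \alpha^{j-1} \;+\; \sum_{j=1}^{t^3} \chi_j \bigl( \alpha^{3t+j-1} + \alpha^{t^3+3t+j-1} \bigr) \;+\; \sum_{j=1}^{t^3} \alpha^{2t^3+3t-1+j} \tag{17}$$

Thus $\gamma$ is an element of $\mathbb{F}_q$ that corresponds to the binary $m$-tuple $(\mathbf{1}, \chi_T, \chi_T, \mathbf{1})$
in the chosen basis, where the first $\mathbf{1}$ in $(\mathbf{1}, \chi_T, \chi_T, \mathbf{1})$ is the all-one vector of
length $3t$ while the second $\mathbf{1}$ is the all-one vector of length $t^3$. From here,
proceed exactly as in the previous section: for each $j = 1, 2, \ldots, w+1$, compute

$$z_j \stackrel{\text{def}}{=} \frac{\gamma - \sum_{i=1,\, i \ne j}^{w+1} x_i}{\prod_{i=1,\, i \ne j}^{w+1} (x_i - x_j)} \qquad \text{and} \qquad \varphi_j \stackrel{\text{def}}{=} \prod_{\beta \in \mathbb{F}_q \setminus \mathcal{D}} (x_j - \beta) \tag{18}$$

Set the target vector $y$ to $(z_1/\varphi_1, z_2/\varphi_2, \ldots, z_{w+1}/\varphi_{w+1}, 0, 0, \ldots, 0)$. In other
words: $y_j = z_j/\varphi_j$ for $j = 1, 2, \ldots, w+1$, and $y_j = 0$ otherwise.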

We will refer to the foregoing computation as the 3-DM/MLD-RSwP conversion
procedure. It should be evident from Lemma 1 that this procedure runs in time and space that are
polynomial in $t$. Furthermore, it is clear that $m, k, w$ in Step A and $\mathcal{D}$ in Step B depend only
on $t$. Thus properties P1 and P2 above are satisfied, and it remains to prove property P3.

To this end, consider the $m \times n$ (or $3(t^3 + t) \times 4t^3$) binary matrix $W$ having the binary
representations of $x_1, x_2, \ldots, x_n$ as its columns. By construction (compare with the
definition of $x_1, x_2, \ldots, x_n$ in (16)) this matrix has the following structure:

$$W = \begin{bmatrix} U & 0 & 0 & 0 \\ I & I & 0 & 0 \\ 0 & I & I & 0 \\ 0 & I & I & I \end{bmatrix} \tag{19}$$

where $I$ is the $t^3 \times t^3$ identity matrix and $U$ is the $3t \times t^3$ matrix consisting of the binary
representations of the $t^3$ triples in $\mathcal{U} \times \mathcal{U} \times \mathcal{U}$, that is, the $j$-th column of $U$ is the binary
representation of $\alpha^{a_j-1} + \alpha^{t+b_j-1} + \alpha^{2t+c_j-1}$ where $(a_j, b_j, c_j)$ is the $j$-th triple in $\mathcal{U} \times \mathcal{U} \times \mathcal{U}$.

Lemma 6. The set $T$ has a matching if and only if there is a set of exactly $w = t^3 + t$
columns of $W$ that add to the vector $(\mathbf{1}, \chi_T, \chi_T, \mathbf{1})^t$, which is the binary representation of $\gamma$.

Proof. Since the order imposed on the triples of $\mathcal{U} \times \mathcal{U} \times \mathcal{U}$ is arbitrary, we may assume
w.l.o.g. that the triples in $T$ correspond to the first $|T|$ columns of the matrix $U$. ($\Rightarrow$)
Suppose there is a vector $v \in \mathbb{F}_2^n$ with $\mathrm{wt}(v) = t^3 + t$ such that $Wv^t = (\mathbf{1}, \chi_T, \chi_T, \mathbf{1})^t$. Write
$v = (v_1, v_2, v_3, v_4)$, where $v_1, v_2, v_3, v_4$ are vectors of length $t^3$. For $i = 1, 2, 3, 4$, let

$\eta_i \stackrel{\text{def}}{=}$ the weight of the first $|T|$ positions of $v_i$

$\bar{\eta}_i \stackrel{\text{def}}{=}$ the weight of the last $t^3 - |T|$ positions of $v_i$

The structure of the matrix $W$ in (19) along with the fact that $Wv^t = (\mathbf{1}, \chi_T, \chi_T, \mathbf{1})^t$ imply
the following relationships

$$\eta_2 = |T| - \eta_1 \ \text{ and } \ \bar{\eta}_2 = \bar{\eta}_1 \qquad (\text{since } v_1 + v_2 = \chi_T) \tag{20}$$

$$\eta_3 = |T| - \eta_2 \ \text{ and } \ \bar{\eta}_3 = \bar{\eta}_2 \qquad (\text{since } v_2 + v_3 = \chi_T) \tag{21}$$

$$\bar{\eta}_4 = t^3 - |T| \ \text{ and } \ \eta_4 = 0 \qquad (\text{since } v_4 = \mathbf{1} - (v_2 + v_3) = \mathbf{1} - \chi_T) \tag{22}$$

among $\eta_1, \eta_2, \eta_3, \eta_4$ and $\bar{\eta}_1, \bar{\eta}_2, \bar{\eta}_3, \bar{\eta}_4$. Using (20), (21), (22) in conjunction with the fact
that $\mathrm{wt}(v) = \eta_1 + \eta_2 + \eta_3 + \eta_4 + \bar{\eta}_1 + \bar{\eta}_2 + \bar{\eta}_3 + \bar{\eta}_4 = t^3 + t$, we obtain

$$\eta_1 + 3\bar{\eta}_1 = t \tag{23}$$

But $\mathrm{wt}(v_1) = \eta_1 + \bar{\eta}_1 \ge t$, since $Uv_1^t = \mathbf{1}^t$ and the weight of each column of $U$ is 3. In
conjunction with (23), this implies that

$$\eta_1 = t \ \text{ and } \ \bar{\eta}_1 = 0$$

This means that there are some $t$ columns among the first $|T|$ columns of $U$ (corresponding
to the triples in $T$) that add (mod 2) to the all-one vector. Hence, there is a matching in $T$.
($\Leftarrow$) Conversely, suppose there is a matching in $T$. We then take $v_1$ to be the binary vector
of length $t^3$ and weight $t$ whose nonzero positions are given by the corresponding $t$ columns
of $U$. Setting

$$v_2 = \chi_T - v_1, \qquad v_3 = v_1, \qquad \text{and} \qquad v_4 = \mathbf{1} - \chi_T \tag{24}$$

it is easy to verify that the vector $v = (v_1, v_2, v_3, v_4)$ satisfies $Wv^t = (\mathbf{1}, \chi_T, \chi_T, \mathbf{1})^t$ and
has weight $t + (|T| - t) + t + (t^3 - |T|) = t^3 + t$. ∎
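
The constructive direction of Lemma 6 can be replayed on a toy instance. The sketch below is an illustration under stated assumptions: it takes $t = 2$, a hypothetical set $T$ containing a matching, and the block structure of $W$ given in (19); the helper name `apply_W` is not from the paper. It builds $v = (v_1, v_2, v_3, v_4)$ as in (24) and checks both the target vector and the weight.

```python
from itertools import product

def apply_W(U_cols, v1, v2, v3, v4):
    """Multiply W = [[U,0,0,0],[I,I,0,0],[0,I,I,0],[0,I,I,I]] by (v1,v2,v3,v4) over F_2."""
    top = [0] * len(U_cols[0])
    for col, bit in zip(U_cols, v1):
        if bit:
            top = [a ^ b for a, b in zip(top, col)]
    row2 = [a ^ b for a, b in zip(v1, v2)]
    row3 = [a ^ b for a, b in zip(v2, v3)]
    row4 = [a ^ b ^ c for a, b, c in zip(v2, v3, v4)]
    return top + row2 + row3 + row4

t = 2
all_triples = list(product(range(1, t + 1), repeat=3))   # the t^3 triples, in a fixed order
T = {(1, 1, 2), (2, 2, 1), (1, 2, 2)}                    # hypothetical instance
matching = {(1, 1, 2), (2, 2, 1)}                        # a matching contained in T

# Column j of U is the binary representation of the j-th triple.
U_cols = []
for a, b, c in all_triples:
    col = [0] * (3 * t)
    col[a - 1] = col[t + b - 1] = col[2 * t + c - 1] = 1
    U_cols.append(col)

chi = [int(s in T) for s in all_triples]                 # characteristic vector of T
v1 = [int(s in matching) for s in all_triples]
v2 = [c ^ a for c, a in zip(chi, v1)]                    # chi_T - v1 over F_2
v3 = list(v1)
v4 = [1 ^ c for c in chi]                                # 1 - chi_T over F_2

target = [1] * (3 * t) + chi + chi + [1] * len(all_triples)
print(apply_W(U_cols, v1, v2, v3, v4) == target)
print(sum(v1 + v2 + v3 + v4) == t ** 3 + t)
```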

To prove that the 3-DM/MLD-RSwP conversion procedure satisfies property P3, it remains
to combine Lemma 6 with the results of the previous section.

Lemma 7. The set $T$ has a matching if and only if there is a codeword $c \in C_q(\mathcal{D}, k)$ such
that $d(c, y) \le w$, where $q = 2^m$ and $m, k, \mathcal{D}, w, y$ are the values computed from $\{t, T\}$
in the 3-DM/MLD-RSwP conversion procedure.

Proof. Let $H$ be the $(w+1) \times n$ parity-check matrix in (6), but with $x_1, x_2, \ldots, x_n$ now
defined by (16). Using Lemma 6 and proceeding exactly as in Lemma 2, we conclude that
$T$ has a matching if and only if there is a vector $v \in \mathbb{F}_q^n$ of weight $\le w$ such that

$$Hv^t = (0, 0, \ldots, 0, 1, \gamma)^t$$

where $\gamma$ is given by (17). By Corollary 3 and Lemma 4 of the previous section, this happens
if and only if there is a codeword $c \in C_q(\mathcal{D}, k)$ such that $d(c, y) \le w$. ∎






Theorem 8. There is an infinite sequence of Reed-Solomon codes $\{C_{2^{m_i}}(\mathcal{D}_i, k_i)\}_{i \ge 1}$ that
can be explicitly specified in terms of the underlying fields $\mathbb{F}_{2^{m_1}}, \mathbb{F}_{2^{m_2}}, \ldots$, evaluation sets
$\mathcal{D}_1 \subset \mathbb{F}_{2^{m_1}}, \mathcal{D}_2 \subset \mathbb{F}_{2^{m_2}}, \ldots$, and dimensions $k_1, k_2, \ldots$, such that the following holds:
unless NP $\subseteq$ P/poly and the polynomial hierarchy collapses at the second level, there is no
polynomial-size family of circuits $\{\mathcal{C}_i\}_{i \ge 1}$ so that $\mathcal{C}_i$ solves the maximum-likelihood
decoding problem for the code $C_{2^{m_i}}(\mathcal{D}_i, k_i)$, for all $i = 1, 2, \ldots$.

Proof. Lemma 7 proves that the 3-DM/MLD-RSwP conversion procedure satisfies
properties P1, P2, and P3. Since Three-Dimensional Matching is NP-complete, this
immediately implies the theorem (see the discussion at the beginning of this section). ∎

Theorem 8 is our main result in this section. In plain language, this theorem says that there
exist Reed-Solomon codes for which maximum-likelihood decoding is computationally 
hard even if unlimited preprocessing of the code is allowed. 



4. Discussion and open problems 

We begin this section with a disclaimer, which also leads to an interesting open problem. 
The 3-DM/MLD-RS conversion procedure of Section 2 produces a specific class of Reed-Solomon
codes, and Theorem 5 says that there exist codes in this class that are hard to decode
(unless P = NP). However, since $|\mathcal{D}| = |T| \le t^3$ while $|\mathbb{F}_{2^m}| = 2^m$ in our conversion
procedure, all the codes in this class use only a tiny fraction of the underlying field as
evaluation points. Thus our hardness results do not apply if, say, all the field elements (or 
all the nonzero field elements) are taken as evaluation points, as is often the case with Reed- 
Solomon codes. On the other hand, the algebraic decoding algorithms for Reed-Solomon 
codes [12], [21], [24] do not take advantage of this fact and work just as well for arbitrary
sets of evaluation points (such as those produced by our conversion procedure). 

Nevertheless, it remains an intriguing open question whether a similar hardness result can 
be established for Reed-Solomon codes that use the entire field (or a large part thereof) 
as their set of evaluation points. The proof of this (if it exists) will probably require new 
techniques, and might also pave the way for establishing NP-hardness of maximum-like- 
lihood decoding for primitive binary BCH codes. We observe that such a proof would 
immediately imply hardness with unlimited preprocessing, since in this situation the code 
is essentially fixed: only its rate and the received syndrome are part of the input. 



We next record a simple corollary to our main result. It is well known [Chapter 10,
p. 281] that the covering radius $\rho$ of an $(n, k)$ Reed-Solomon code $C_q(\mathcal{D}, k)$ is given by
$\rho = n - k$. A vector $y \in \mathbb{F}_q^n$ is said to be a deep hole of $C_q(\mathcal{D}, k)$ if the distance from $y$ to
(the closest codeword of) this code is exactly $\rho$. We observe that the value of $w$ in the
reduction of Section 2 is $n - k - 1 = \rho - 1$, so that we are asking whether there exists a
codeword $c \in C_q(\mathcal{D}, k)$ such that $d(c, y) \le \rho - 1$. This is equivalent to the question: is $y$ a deep
hole of $C_q(\mathcal{D}, k)$? Hence, Theorem 5 immediately implies the following result.

Corollary 9. It is NP-hard to determine whether a given vector $y \in \mathbb{F}_q^n$ is a deep hole of
a given Reed-Solomon code $C_q(\mathcal{D}, k)$.

In fact, it is easy to see from the proof of the corresponding lemma in Section 2 that the distance from the
vector $y$ constructed in the 3-DM/MLD-RS conversion procedure to $C_q(\mathcal{D}, k)$ is at least $w = \rho - 1$.
Thus an even more specialized task is NP-hard: given a vector which is either at distance $\rho$
or at distance $\rho - 1$ from $C_q(\mathcal{D}, k)$, determine which is the case. Note that the reduction in
Section 3 still has the property that $w = n - k - 1 = \rho - 1$. Thus identifying deep holes
of a Reed-Solomon code (or deciding whether a given vector is at distance $\rho$ or $\rho - 1$ from
the code) is computationally hard even if unlimited preprocessing of the code is allowed.
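
To make the deep-hole definition concrete, here is a minimal brute-force sketch on a toy code. It is an illustration under simplifying assumptions, not the construction used in the reductions: it works over a small prime field GF(p) rather than $\mathbb{F}_{2^m}$, and the function names `rs_codewords`, `distance_to_code`, and `is_deep_hole` are hypothetical. The search enumerates all $p^k$ codewords, so it is feasible only for tiny parameters.

```python
from itertools import product


def rs_codewords(p, points, k):
    """Yield every codeword (f(x) for x in points) with deg f < k over GF(p), p prime."""
    for coeffs in product(range(p), repeat=k):
        yield tuple(
            sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p for x in points
        )


def distance_to_code(p, points, k, y):
    """Hamming distance from y to the closest codeword, by exhaustive search."""
    return min(
        sum(a != b for a, b in zip(cw, y)) for cw in rs_codewords(p, points, k)
    )


def is_deep_hole(p, points, k, y):
    """True if y lies at distance exactly rho = n - k from the code."""
    rho = len(points) - k
    return distance_to_code(p, points, k, y) == rho


# Toy instance: n = 5, k = 2 over GF(5), so rho = n - k = 3.
p, k = 5, 2
points = [0, 1, 2, 3, 4]
y_far = [x * x % p for x in points]  # evaluations of x^2, a polynomial of degree exactly k
y_near = [1, 1, 0, 0, 0]             # the zero codeword with two positions changed
print(is_deep_hole(p, points, k, y_far), distance_to_code(p, points, k, y_near))
# prints: True 2
```

Here `y_far` is a deep hole because $x^2 - f(x)$ has at most two roots for every $f$ of degree less than 2, while `y_near` sits at distance $\rho - 1 = 2$. Telling these two cases apart is exactly the specialized task described above.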

Concerning the results of Section 3, we observe that a polynomial-time maximum-likeli- 
hood decoding algorithm for some specific Reed-Solomon codes (if it exists) must make 
essential use of the structure of the evaluation sets for these codes. Section 3 shows that, 
assuming NP does not have polynomial-size circuits, there is no generic representation of 
the evaluation points that would permit polynomial-time maximum-likelihood decoding. 



We conclude the paper with two more open problems. First, it would be interesting to estab- 
lish NP-hardness of maximum-likelihood decoding for a nontrivial family of binary codes. 
Straightforward concatenation of Reed-Solomon codes over $\mathbb{F}_{2^m}$ with $(2^m - 1, m, 2^{m-1})$
simplex (Hadamard) codes does not work, since the length of the concatenated code would 
be exponential in the length of the Reed-Solomon code for our reduction. 
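
As a rough illustration of this blow-up (a sketch only; the symbols $N$ and $K$ are introduced here and do not appear in the original text): concatenating an $(n, k)$ Reed-Solomon code over $\mathbb{F}_{2^m}$ with the $(2^m - 1, m, 2^{m-1})$ simplex code replaces each code symbol by a binary block of length $2^m - 1$, so the resulting binary code has length and dimension

```latex
\[
  N = n\,(2^m - 1), \qquad K = k\,m .
\]
```

The factor $2^m - 1$ is the number of nonzero field elements, so $N$ grows with the size of the field rather than with the number $n$ of evaluation points.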

Another important open problem is this. As discussed in Corollary 9, maximum-likelihood
decoding of Reed-Solomon codes becomes hard when the number of errors is large: one
less than the covering radius of the code. It is an extremely interesting problem to show
hardness of bounded-distance decoding of Reed-Solomon codes for a smaller decoding
radius. At present, there remains a large gap between our hardness results and the decoding
radius up to which polynomial-time decoding algorithms are known [12].